Causal Reinforcement Learning: An Instrumental Variable Approach
Authors
Abstract
In the standard data analysis framework, data is first collected (once for all), and then data analysis is carried out. Moreover, the data-generating process is typically assumed to be exogenous. This approach is natural when the data analyst has no impact on how the data is generated. The advancement of digital technology, however, has facilitated firms to learn from data and make decisions at the same time. As these decisions generate new data, the data analyst (a business manager or an algorithm) also becomes the data generator. In this article, we formulate the problem as a Markov decision process (MDP) and show that the interaction generates a new type of bias, reinforcement bias, that exacerbates the endogeneity problem in static data analysis. When the data are independent and identically distributed, we embed the instrumental variable (IV) approach in the stochastic gradient descent algorithm to correct for the bias. For general MDP problems, we propose a class of IV-based reinforcement learning (RL) algorithms to correct for the bias. We establish asymptotic properties of the algorithms by incorporating them into a two-timescale stochastic approximation (SA) framework. Our formulation requires unbounded state space and, more importantly, Markovian noise. Therefore, standard techniques in the RL and SA literature, which rely on boundedness of the state space and a martingale-difference structure of the noise, do not apply. We develop new techniques to establish finite-time risk bounds, finite-time bounds for trajectory stability, and the asymptotic distribution of the IV-RL algorithms.
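The i.i.d. case mentioned in the abstract, an instrumental variable embedded in stochastic gradient descent, can be illustrated with a toy simulation. The sketch below is not the authors' algorithm: the linear model, the data-generating process, the step-size schedule, and the names `simulate` and `stochastic_recursion` are all assumptions made for illustration. It compares the plain least-squares SGD update with an update in which the regressor in the gradient direction is replaced by an instrument.

```python
import numpy as np

def simulate(n, theta_true=2.0, seed=0):
    """Draw i.i.d. data with an endogenous regressor x and an instrument z (assumed toy model)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)   # instrument: shifts x, independent of the confounder
    u = rng.normal(size=n)   # unobserved confounder
    x = z + u                # x is correlated with u, hence endogenous
    y = theta_true * x + u + 0.1 * rng.normal(size=n)
    return x, y, z

def stochastic_recursion(x, y, w, c=2.0, t0=50.0):
    """Run theta <- theta + a_t * w_t * (y_t - x_t * theta) over the sample.

    w = x gives plain least-squares SGD; w = z gives the IV-corrected update.
    """
    theta = 0.0
    for t in range(len(x)):
        a_t = c / (t + t0)   # diminishing step size (an assumed schedule)
        theta += a_t * w[t] * (y[t] - x[t] * theta)
    return theta

x, y, z = simulate(100_000)
theta_sgd = stochastic_recursion(x, y, w=x)   # ignores endogeneity
theta_iv = stochastic_recursion(x, y, w=z)    # uses the instrument
print(f"plain SGD: {theta_sgd:.3f}, IV-SGD: {theta_iv:.3f}")
```

In this assumed design the plain update targets E[xy]/E[x^2] = 2.5 rather than the true coefficient 2.0, while the instrumented update targets E[zy]/E[zx] = 2.0, so the two printed estimates should differ by roughly the endogeneity bias of 0.5.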
Similar resources
Mendelian randomization as an instrumental variable approach to causal inference.
In epidemiological research, the causal effect of a modifiable phenotype or exposure on a disease is often of public health interest. Randomized controlled trials to investigate this effect are not always possible and inferences based on observational data can be confounded. However, if we know of a gene closely linked to the phenotype without direct effect on the disease, it can often be reaso...
Variable Impedance Control - A Reinforcement Learning Approach
One of the hallmarks of the performance, versatility, and robustness of biological motor control is the ability to adapt the impedance of the overall biomechanical system to different task requirements and stochastic disturbances. A transfer of this principle to robotics is desirable, for instance to enable robots to work robustly and safely in everyday human environments. It is, however, not t...
Complier-average causal effects for multivariate outcomes: an instrumental variable approach with application to health economics
In randomised controlled trials that have non-compliance with the treatment assigned, policy makers require unbiased estimates of the causal effect of the treatment received. Instrumental variable (IV) approaches provide complier average causal effects (CACE) estimates. Common IV methods such as two-stage least squares (2SLS) have not been extended to settings with multivariate outcomes. We pro...
Identification of causal relations in neuroimaging data with latent confounders: An instrumental variable approach
We consider the task of inferring causal relations in brain imaging data with latent confounders. Using a priori knowledge that randomized experimental conditions cannot be effects of brain activity, we derive statistical conditions that are sufficient for establishing a causal relation between two neural processes, even in the presence of latent confounders. We provide an algorithm to test the...
A Causal Approach to Hierarchical Decomposition in Reinforcement Learning
Journal
Journal title: Social Science Research Network
Year: 2021
ISSN: 1556-5068
DOI: https://doi.org/10.2139/ssrn.3792824